Abstract

We introduce an algebra of instruction sequences by presenting a 
semigroup C in which programs can be represented without directional 
bias: in terms of the next instruction to be executed, C has both for- 
ward and backward instructions and a C-expression can be interpreted 
starting from any instruction. We provide equations for thread extrac- 
tion, i.e., C's program semantics. Then we consider thread extraction 
compatible (anti-)homomorphisms and (anti-)automorphisms. Finally 
we discuss some expressiveness results. 

1 Introduction 

In this paper three types of mathematical objects play a basic role: 

1. Pieces of code, i.e., finite sequences of instructions, given some set I 
of instructions. A (computer) program is in our case a piece of code 
that satisfies the additional property that each state of its execution 
is prescribed by an instruction (typically, there are no jumps outside 
the range of instructions). 

2. Finite and infinite sequences of primitive instructions (briefly, SPIs), 
the mathematical objects denoted by pieces of code (in particular 
by programs). Primitive instructions are taken from a set U that
(possibly after some renaming) is a strict subset of I. The execution
of a SPI is single-pass: it starts with executing the first primitive
instruction, and each primitive instruction is dropped after it has been
executed or jumped over.

1 Section Theoretical Computer Science, Informatics Institute, University of Amsterdam.
The authors acknowledge support from the NWO project Thread Algebra for Strate-

3. Threads, the mathematical objects representing the execution behav- 
ior of programs and used as their program semantics. Threads are 
defined using polarized actions and a certain form of conditional com- 
position. 

While each (computer) program can be considered as representing a 
sequence of instructions, the converse is not true. Omitting a few lines of 
code from a (well-formed) program usually results in an ill-formed program, 
if the remainder can be called a program at all. Before we discuss the in- 
struction sequence semigroup mentioned in the title of this paper we briefly 
consider "threads", the mathematical objects representing the execution be- 
havior of programs, or, more generally, of instruction sequences. Threads 
as considered here resemble finite state schemes that represent the execu- 
tion of imperative programs in terms of their (control) actions. We take an 
abstract point of view and only consider actions and tests with symbolic 
names (a, b, . . .): 

In this picture, 

[ a ] models the execution of action a and 
its descent leads to the state thereafter 
(and likewise for [c]); 

⟨ b ⟩ models the execution of test action
b; its left descent models the "true-case"
and its right one the "false-case"
(and likewise for ⟨ d ⟩);

S is the state that models termination. 

Finite state threads as the one above can be produced in many ways, and 
a primary goal of program algebra (PGA) is to study which primitives and 
program notations serve that purpose well. The first publication on PGA 
is the paper [7]. A basic expressiveness result states that the class of SPIs 
that can be directly represented in PGA (the so-called periodic SPIs) corre- 
sponds with these finite state threads: each PGA-program produces upon 
execution a finite state thread, and conversely, each finite state thread is 
produced by some PGA-program. 
In this paper we introduce a set of instructions that also suits the above-
mentioned purpose well and that at the same time has nice mathematical
properties. Together with concatenation, its natural operation, it forms a
semigroup with involutions that we call C (for "code"). A simple involutive
anti-automorphism² transforms each C-program into one of which the
interpretation from right to left produces the same thread as the original
program. Furthermore we define some homomorphisms and automorphisms
that preserve the threads produced by C-expressions, thereby exemplifying
a simple case of systematic program transformation. We generalize this
approach by defining bijections on finite state threads and describe the
associated automorphisms and anti-automorphisms on C, which are all
generated from simple involutions. Finally, we study a few basic expressiveness
questions about C.

The paper is structured as follows: In Section [2] we review threads in the
setting of program algebra. Then, in Section [3] we introduce the semigroup
C of sequences of instructions that this paper is about. In Section [4] we
define thread extraction on C, thereby giving semantics to C-expressions:
each C-expression produces a finite state thread. In Section [5] we define
'C-programs' and show that these are sufficient to produce finite state threads.
Furthermore, only certain test instructions in C are necessary to preserve
C's expressive power.

Section [6] is about a thread extraction preserving homomorphism on C 
and a related anti-homomorphism. Then, in Section [7] we define a natural 
class of bijections on threads and establish a relation with a class of auto- 
morphisms on C, and in Section [8] we do the same thing with respect to a 
related class of anti-automorphisms on C. 

In Section [9] we further consider C's instructions in the perspective 
of expressiveness and show that restricting to a bound on the counters of 
jump instructions yields a loss in expressive power. In Section [10] we use 
Boolean registers to facilitate easy programming of finite state threads, and 
in Section [11] we relate the length of a C-program to the number of states
of the thread it produces. 

In Section [12] we discuss C as a context in which some fundamental 
questions about programming can be further investigated and come up with 
some conclusions. 

The paper ends with an appendix that contains some background
information (Sections [A], [B] and [C]).

2 We refer to [11] as a general reference for algebraic notions. 
2 Basic Thread Algebra



In this section we review threads as they emerge from the behavioral ab- 
straction from programs. Most of this text is taken from [14]. 

Basic Thread Algebra (BTA) is a form of process algebra which is 
tailored to the description of sequential program behavior. Based on a set 
A of actions, it has the following constants and operators: 

• the termination constant S, 

• the deadlock or inaction constant D, 

• for each $a \in A$, a binary postconditional composition operator $\_ \unlhd a \unrhd \_$.

We use action prefixing $a \circ P$ as an abbreviation for $P \unlhd a \unrhd P$ and take $\circ$ to
bind strongest. Furthermore, for $n \geq 1$ we define $a^n \circ P$ by $a^1 \circ P = a \circ P$
and $a^{n+1} \circ P = a \circ (a^n \circ P)$.

The operational intuition is that each action represents a command
which is to be processed by the execution environment of the thread. The
processing of a command may involve a change of state of this environment.³
At completion of the processing of the command, the environment produces
a reply value true or false. The thread $P \unlhd a \unrhd Q$ proceeds as $P$ if the
processing of $a$ yields true, and it proceeds as $Q$ if the processing of $a$ yields
false.

Every thread in BTA is finite in the sense that there is a finite upper
bound to the number of consecutive actions it can perform. The approximation
operator $\pi : \mathbb{N} \times \mathrm{BTA} \to \mathrm{BTA}$ gives the behavior up to a specified
depth. It is defined by

1. $\pi(0, P) = D$,

2. $\pi(n+1, S) = S$, $\pi(n+1, D) = D$,

3. $\pi(n+1, P \unlhd a \unrhd Q) = \pi(n, P) \unlhd a \unrhd \pi(n, Q)$,

for $P, Q \in \mathrm{BTA}$ and $n \in \mathbb{N}$. We further write $\pi_n(P)$ instead of $\pi(n, P)$. We
find that for every $P \in \mathrm{BTA}$, there exists an $n \in \mathbb{N}$ such that

\[
\pi_n(P) = \pi_{n+1}(P) = \cdots = P.
\]
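The three defining clauses of the approximation operator can be mirrored directly in code. The following is only an illustrative sketch: the class names `S`, `D`, `Post` and the functions `prefix` and `approx` are assumptions of this sketch, not notation from BTA.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class S:
    """Termination constant."""


@dataclass(frozen=True)
class D:
    """Deadlock (inaction) constant."""


@dataclass(frozen=True)
class Post:
    """Postconditional composition: `left` on reply true, `right` on reply false."""
    left: "Thread"
    action: str
    right: "Thread"


Thread = Union[S, D, Post]


def prefix(action: str, p: Thread) -> Thread:
    """Action prefixing a o P, an abbreviation for P <| a |> P."""
    return Post(p, action, p)


def approx(n: int, p: Thread) -> Thread:
    """Behavior of the finite thread p up to depth n (clauses 1-3)."""
    if n == 0:
        return D()                      # clause 1
    if isinstance(p, Post):             # clause 3
        return Post(approx(n - 1, p.left), p.action, approx(n - 1, p.right))
    return p                            # clause 2: S and D are unchanged


# Example thread: (b o S) <| a |> D
p = Post(prefix("b", S()), "a", D())
```

For this example thread the approximations stabilize at depth 3, in line with the observation that every finite thread equals its own projection from some depth onwards.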

3 For the definition of threads we completely abstract from the environment. In Ap- 
pendix [C] we define services which model (part of) the environment, and thread-service 
composition. 
Following the metric theory of [1] in the form developed as the basis
of the introduction of processes in [5], BTA has a completion $\mathrm{BTA}^\infty$ which
also comprises the infinite threads. Standard properties of the completion
technique yield that we may take $\mathrm{BTA}^\infty$ as the cpo consisting of all so-called
projective sequences:⁴

\[
\mathrm{BTA}^\infty = \{ (P_n)_{n \in \mathbb{N}} \mid \forall n \in \mathbb{N}\ (P_n \in \mathrm{BTA} \ \&\ \pi_n(P_{n+1}) = P_n) \}.
\]

For a detailed account of this construction see [3] or [15]. On $\mathrm{BTA}^\infty$, equality
is defined componentwise: $(P_n)_{n \in \mathbb{N}} = (Q_n)_{n \in \mathbb{N}}$ if for all $n \in \mathbb{N}$, $P_n = Q_n$.

Overloading notation, we now define the constants and operators of
BTA on $\mathrm{BTA}^\infty$:

1. $D = (D, D, \ldots)$ and $S = (D, S, S, \ldots)$;

2. $(P_n)_{n \in \mathbb{N}} \unlhd a \unrhd (Q_n)_{n \in \mathbb{N}} = (R_n)_{n \in \mathbb{N}}$ with $R_0 = D$ and $R_{n+1} = P_n \unlhd a \unrhd Q_n$.

The elements of BTA are included in $\mathrm{BTA}^\infty$ by a mapping following this
definition. E.g.,

\[
a \circ S \mapsto (P_n)_{n \in \mathbb{N}} \quad\text{with } P_0 = D,\ P_1 = a \circ D \text{ and for } n \geq 2,\ P_n = a \circ S.
\]

It is not difficult to show that the projective sequence of $P \in \mathrm{BTA}$ thus
defined equals $(\pi_n(P))_{n \in \mathbb{N}}$. We further use this inclusion of finite threads
in $\mathrm{BTA}^\infty$ implicitly and write $P, Q, \ldots$ to denote elements of $\mathrm{BTA}^\infty$.

We define the set $\mathrm{Res}(P)$ of residual threads of $P$ inductively as follows:

1. $P \in \mathrm{Res}(P)$,

2. $Q \unlhd a \unrhd R \in \mathrm{Res}(P)$ implies $Q \in \mathrm{Res}(P)$ and $R \in \mathrm{Res}(P)$.

A residual thread may be reached (depending on the execution environment)
by performing zero or more actions. A thread $P$ is regular if $\mathrm{Res}(P)$ is finite.
Regular threads are also called finite state threads.

A finite linear recursive specification over $\mathrm{BTA}^\infty$ is a set of equations

\[
x_i = t_i
\]

for $i \in I$ with $I$ some finite index set, variables $x_i$, and all $t_i$ terms of the
form $S$, $D$, or $x_j \unlhd a \unrhd x_k$ with $j, k \in I$. Finite linear recursive specifications
represent continuous operators having unique fixed points [15].

4 The cpo is based on the partial ordering $\sqsubseteq$ defined by $D \sqsubseteq P$, and $P \sqsubseteq P'$, $Q \sqsubseteq Q'$
implies $P \unlhd a \unrhd Q \sqsubseteq P' \unlhd a \unrhd Q'$.
Theorem 1. For all $P \in \mathrm{BTA}^\infty$, $P$ is regular iff $P$ is the solution of a
finite linear recursive specification.

Proof. Suppose $P$ is regular. Then $\mathrm{Res}(P)$ is finite, so $P$ has residual
threads $P_1, \ldots, P_n$ with $P = P_1$. We construct a finite linear recursive
specification with variables $x_1, \ldots, x_n$ as follows:

\[
x_i = \begin{cases}
D & \text{if } P_i = D, \\
S & \text{if } P_i = S, \\
x_j \unlhd a \unrhd x_k & \text{if } P_i = P_j \unlhd a \unrhd P_k.
\end{cases}
\]

For the converse, assume that $P$ is the solution of some finite linear
recursive specification $E$ with variables $x_1, \ldots, x_n$. Because the variables in
$E$ have unique fixed points, we know that there are threads $P_1, \ldots, P_n \in \mathrm{BTA}^\infty$
with $P = P_1$, and for every $i \in \{1, \ldots, n\}$, either $P_i = D$, $P_i = S$, or
$P_i = P_j \unlhd a \unrhd P_k$ for some $j, k \in \{1, \ldots, n\}$. We find that $Q \in \mathrm{Res}(P)$ iff
$Q = P_i$ for some $i \in \{1, \ldots, n\}$. So $\mathrm{Res}(P)$ is finite, and $P$ is regular. □

Example 1. The regular threads $a^2 \circ D$ and $a^\infty = a \circ a \circ \cdots$ are the respective
fixed points for $x_1$ in the finite linear recursive specifications

1. $\{x_1 = a \circ x_2,\ x_2 = a \circ x_3,\ x_3 = D\}$;

2. $\{x_1 = a \circ x_1\}$.

In reasoning with finite linear recursive specifications, we shall often
identify variables and their fixed points. For example, we say that $P$ is the
thread defined by $P = a \circ P$ instead of stating that $P$ equals the fixed point
for $x$ in the finite linear recursive specification $x = a \circ x$. In this paper we
write

\[
T_{reg}
\]

for the set of regular threads in $\mathrm{BTA}^\infty$.

An elegant result based on [2] is that equality of recursively specified
regular threads can be easily decided. Because one can always take the
disjoint union of two finite linear recursive specifications, it suffices to consider
a single finite linear recursive specification $\{P_i = t_i \mid 1 \leq i \leq n\}$. Then
$P_i = P_j$ follows from $\pi_{n-1}(P_i) = \pi_{n-1}(P_j)$. Thus, it is sufficient to decide
whether two certain finite threads are equal. In Appendix [B] we provide a
proof sketch.
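The decision procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions: a specification is encoded as a Python dictionary mapping each variable to `"S"`, `"D"`, or a triple `(x_true, action, x_false)`, and the helper names `make_projection` and `equal_threads` are hypothetical. Projections are represented as nested tuples, which is adequate for small examples only.

```python
from functools import lru_cache


def make_projection(spec):
    """Return proj(depth, var): the projection of the thread for var at that depth."""

    @lru_cache(maxsize=None)
    def proj(depth, var):
        if depth == 0:
            return "D"                              # pi_0(P) = D
        rhs = spec[var]
        if rhs in ("S", "D"):
            return rhs                              # constants are unchanged
        left, action, right = rhs
        return (proj(depth - 1, left), action, proj(depth - 1, right))

    return proj


def equal_threads(spec, x, y):
    """Decide equality of the threads for x and y by comparing depth n-1 projections."""
    n = len(spec)
    proj = make_projection(spec)
    return proj(n - 1, x) == proj(n - 1, y)


# Variables 1, 2 and 3 all specify a^infinity; 5 specifies a o D.
spec = {
    1: (2, "a", 2),
    2: (1, "a", 1),
    3: (3, "a", 3),
    4: "D",
    5: (4, "a", 4),
}
```

Here `equal_threads(spec, 1, 3)` holds although the two variables have syntactically different equations, while `equal_threads(spec, 1, 5)` fails because the projections already differ at depth 2.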
3 C, a Semigroup for Code



In this section we introduce the sequences of instructions that form the 
main subject of this paper. We call these sequences "pieces of code" and 
use the letter C to represent the resulting semigroup. The set A of actions 
represents a parameter for C (as it does for BTA). 

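For $a \in A$ and $k$ ranging over $\mathbb{N}^+$ (i.e., $\mathbb{N} \setminus \{0\}$), C-expressions are of
the following form:

\[
P ::= /a \mid {+}/a \mid {-}/a \mid /\#k \mid \backslash a \mid {+}\backslash a \mid {-}\backslash a \mid \backslash\#k \mid\ ! \mid \# \mid P;P
\]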



In C the operation ";" is called concatenation and all other syntactical
categories are called C-instructions:

/a is a forward basic instruction. It prescribes to perform action a and
then (irrespective of the Boolean reply) to execute the instruction
concatenated to its right-hand side; if there is no such instruction,
deadlock follows.

+/a and -/a are forward test instructions. The positive forward test
instruction +/a prescribes to perform action a and upon reply true
to execute the instruction concatenated to its right-hand side, and
upon reply false to execute the second instruction concatenated to
its right-hand side; if there is no such instruction to be executed,
deadlock follows. For the negative forward test instruction -/a, execution
of the next instruction is prescribed by the complementary replies.

/#k is a forward jump instruction. It prescribes to execute the instruction
that is k positions to the right and deadlock if there is no such
instruction.

\a, +\a, -\a and \#k are the backward versions of the instructions
mentioned above. For these instructions, orientation is from right to left.
For example, \a prescribes to perform action a and then to execute
the instruction concatenated to its left-hand side; if there is no such
instruction, deadlock follows.

! is the termination instruction and prescribes successful termination.

# is the abort instruction and prescribes deadlock.
For C there is one axiom:

\[
(X;Y);Z = X;(Y;Z). \tag{1}
\]

By this axiom, C is a semigroup and we shall not use brackets in repeated
concatenations. As an example,

\[
{+}/a;\ !;\ \backslash\#2
\]

is considered an appropriate C-expression. The instructions for termination
and deadlock are the only instructions that do not specify further control
of execution.

Perhaps the most striking aspect of C is that its sequences of instruc- 
tions have no directional bias. Although most program notations have a 
left to right (and top to bottom) natural order, symmetry arguments clarify 
that an orientation in the other direction might be present as well. 

It is an empirical fact that imperative program notations in the vast 
majority of cases make use of a default direction, inherited from the nat- 
ural language in which a program notation is naturally embedded. This 
embedding is caused by the language designers, or by the language that 
according to the language designers will be the dominant mother tongue 
of envisaged programmers. None of these matters can be considered core 
issues in computer science. 

The fact, however, that imperative programs invariably show a default 
directional bias itself might admit an explanation in terms of complexity 
of design, expression or execution, and C provides a context in which this 
advantage may be investigated. 

Thus, in spite of overwhelming evidence of the presence of directional
bias in 'practice', we propose that the primary notation for sequences of
instructions to be used for theoretical work is C, which refutes this bias.
Obviously, from C one may derive a dialect C' by writing a for /a, +a for
+/a, -a for -/a and #k for /#k. Now there is a directional bias and in
terms of bytes, the instructions are shorter. As explained in Section [5], the
instructions \a, +\a and -\a can be eliminated, thus obtaining a smaller
instruction set which is more easily parsed. One may also do away with a
and -a in favor of +a, again reducing the number of instructions. Reduction
of the number of instructions leads to longer sequences, however, and where
the optimum of this trade-off is found is a matter which lies outside the
theory of instruction sequences per se. We further discuss the nature of C
in Section [12].
4 Thread Extraction and C-Expressions

In this section we define thread extraction on C. For a C-expression $X$,
$|X|^{\rightarrow}$ denotes the thread produced by $X$ when execution starts at the
leftmost or "first" instruction; thus $|\ldots|^{\rightarrow}$ is an operator that assigns a thread
to a C-expression. We prove that this is always a regular thread. We also
consider right-to-left thread extraction, where thread extraction starts at the
rightmost instruction of a C-expression.

We will use auxiliary functions $|X|_j$ with $j$ ranging over the integers $\mathbb{Z}$
and we define

\[
|X|^{\rightarrow} = |X|_1,
\]

meaning that thread extraction starts at the first (or leftmost) instruction
of $X$. For $j \in \mathbb{Z}$, $|X|_j$ is defined in Table [1].

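Let $X = i_1; \ldots; i_n$ and $j \in \mathbb{Z}$.
For $j \in \{1, \ldots, n\}$,

\[
|X|_j = \begin{cases}
a \circ |X|_{j+1} & \text{if } i_j = /a, \\
|X|_{j+1} \unlhd a \unrhd |X|_{j+2} & \text{if } i_j = {+}/a, \\
|X|_{j+2} \unlhd a \unrhd |X|_{j+1} & \text{if } i_j = {-}/a, \\
|X|_{j+k} & \text{if } i_j = /\#k, \\[4pt]
a \circ |X|_{j-1} & \text{if } i_j = \backslash a, \\
|X|_{j-1} \unlhd a \unrhd |X|_{j-2} & \text{if } i_j = {+}\backslash a, \\
|X|_{j-2} \unlhd a \unrhd |X|_{j-1} & \text{if } i_j = {-}\backslash a, \\
|X|_{j-k} & \text{if } i_j = \backslash\#k, \\[4pt]
S & \text{if } i_j = {!}, \\
D & \text{if } i_j = \#,
\end{cases}
\]

for $j \notin \{1, \ldots, n\}$,

\[
|X|_j = D. \tag{2}
\]

Table 1: Equations for thread extraction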
A special case arises if these equations applied from left to right define
a loop without any actions, as in

\[
\begin{aligned}
|/\#2;/a;\backslash\#2|_1 &= |/\#2;/a;\backslash\#2|_3 \\
&= |/\#2;/a;\backslash\#2|_1.
\end{aligned}
\]

For this case we have the following rule:

If the equations in Table [1] applied from left to right yield
a loop without any actions, the extracted thread is D.        (3)

Rule (3) applies if and only if a loop in a thread extraction is the result of
consecutive jumps to jump instructions.

In the following we show that thread extraction on C-expressions produces
regular threads. For a C-expression $X$ we define $\ell(X) \in \mathbb{N}^+$ to be the
length of $X$, i.e., its number of instructions.

Theorem 2. If $X$ is a C-expression and $i \in \mathbb{Z}$, then $|X|_i$ defines a regular
thread.

Proof. Assume $X$ is a C-expression with $\ell(X) = n$. If $i \notin \{1, \ldots, n\}$, then
$|X|_i = D$ by rule (2). In the other case, a single application of the matching
equation in Table [1] determines for each $i \in \{1, \ldots, n\}$ an equation of the
form

\[
|X|_i = |X|_j \unlhd a \unrhd |X|_k, \quad\text{or}\quad |X|_i = |X|_j, \quad\text{or}\quad |X|_i = D, \quad\text{or}\quad |X|_i = S \tag{4}
\]

where by rule (2) we may assume that all expressions $|X|_j$ and $|X|_k$
occurring in the right-hand sides satisfy $j, k \in \{1, \ldots, n\}$ (otherwise they are
replaced by $D$). We construct $n$ linear equations $x_i = t_i$ with the property
that $|X|_i$ as given by the rules for thread extraction is a fixed point for $x_i$:

1. Define $x_i = t_i$ from (4) by replacing each $|X|_j$ by $x_j$.

2. Determine with rule (3) all equations $|X|_i = |X|_j$ that define a loop
without actions, and replace all associated equations $x_i = x_j$ by
$x_i = D$.

3. Replace any remaining equation of the form $x_i = x_j$ by
$x_i = t_j$, where $t_j$ is the right-hand side of the equation for $x_j$. Repeating
this procedure exhaustively yields a finite linear specification with
variables $x_1, \ldots, x_n$.

For each $i \in \{1, \ldots, n\}$ the thread defined by thread extraction on $|X|_i$ is a
fixed point for $x_i$. Hence $|X|_i$ is a regular thread, and in particular so is $|X|^{\rightarrow}$. □

Given some C-expression $X$, we shall often use $|X|_i$ as the identifier of
the thread defined by $|X|_i$ as meant in Theorem [2], and similarly for $|X|^{\rightarrow}$.
As an example of thread extraction, consider the C-expression

\[
X = /a;\ {+}/b;\ \backslash c;\ {+}/d;\ !;\ \backslash\#5 \tag{5}
\]

It is not hard to check that $X$ produces the regular thread $P_1$ (i.e., $|X|^{\rightarrow} = P_1$)
defined by⁵

\[
\begin{aligned}
P_1 &= a \circ P_2 \\
P_2 &= P_3 \unlhd b \unrhd P_4 \\
P_3 &= c \circ P_2 \\
P_4 &= P_5 \unlhd d \unrhd P_1 \\
P_5 &= S
\end{aligned}
\]
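The equations for thread extraction, together with the rule for action-free loops, can be turned into a small program that computes a finite linear specification for a C-expression. The sketch below is illustrative only: instructions are written as ASCII strings separated by `;`, the function names `parse`, `step` and `extract` are hypothetical, and positions outside the range (or on an action-free jump loop) are represented by the extra variable `0` with equation `D`.

```python
import re

INSTR = re.compile(r"^([+-]?)([/\\])(#?)(\w+)$")


def parse(code):
    """Split 'i1;i2;...' into a list of instruction strings."""
    return [tok.strip() for tok in code.split(";")]


def step(instr, j):
    """One application of the extraction equations at position j.

    Returns 'S', 'D', ('goto', target) for a jump,
    or (pos_true, action, pos_false) for basic and test instructions.
    """
    if instr == "!":
        return "S"
    if instr == "#":
        return "D"
    sign, slash, is_jump, arg = INSTR.match(instr).groups()
    d = 1 if slash == "/" else -1            # forward or backward orientation
    if is_jump:
        return ("goto", j + d * int(arg))
    if sign == "+":
        return (j + d, arg, j + 2 * d)
    if sign == "-":
        return (j + 2 * d, arg, j + d)
    return (j + d, arg, j + d)               # basic instruction ignores the reply


def extract(code):
    """Return {position: rhs} with rhs 'S', 'D' or (pos_true, action, pos_false)."""
    instrs = parse(code)
    n = len(instrs)

    def resolve(j):
        """Follow jumps from j; 0 stands for D (out of range or action-free loop)."""
        seen = set()
        while 1 <= j <= n:
            if j in seen:
                return 0                     # loop of jumps without actions
            seen.add(j)
            rhs = step(instrs[j - 1], j)
            if rhs[0] != "goto":
                return j
            j = rhs[1]
        return 0                             # position outside 1..n

    spec = {0: "D"}
    for j in range(1, n + 1):
        target = resolve(j)
        rhs = step(instrs[target - 1], target) if target else "D"
        if isinstance(rhs, tuple):
            rhs = (resolve(rhs[0]), rhs[1], resolve(rhs[2]))
        spec[j] = rhs
    return spec


example = extract(r"/a;+/b;\c;+/d;!;\#5")
```

For the example expression, positions 1 to 5 receive exactly the equations of $P_1, \ldots, P_5$ listed above, and position 6 (the backward jump) receives the same right-hand side as position 1.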

Thread extraction defines an equivalence on C-expressions, say $X \equiv_{\rightarrow} Y$ if
$|X|^{\rightarrow} = |Y|^{\rightarrow}$, that is not a congruence, e.g.,

\[
\# \equiv_{\rightarrow} /\#1 \quad\text{but}\quad \#;/a \not\equiv_{\rightarrow} /\#1;/a.
\]

We define right-to-left thread extraction, notation

\[
|X|^{\leftarrow},
\]

as the thread extraction that starts from the rightmost position of a piece
of code:

\[
|X|^{\leftarrow} = |X|_{\ell(X)}
\]

where $\ell(X) \in \mathbb{N}^+$ is the length of $X$, i.e., its number of instructions. Taking
$X$ as defined in Example (5), we find $|X|^{\leftarrow} = |X|^{\rightarrow}$ because for that
particular $X$, $|X|_6 = |X|_1$. Right-to-left thread extraction also defines an
equivalence on C-expressions, say $X \equiv_{\leftarrow} Y$ if $|X|^{\leftarrow} = |Y|^{\leftarrow}$, that is not a
congruence, e.g.,

\[
\# \equiv_{\leftarrow} \backslash\#1 \quad\text{but}\quad /a;\# \not\equiv_{\leftarrow} /a;\backslash\#1.
\]

5 This regular thread $P_1$ can be visualized as was done in Section [1].
5 Expressiveness of C-Programs



In this section we introduce the notion of a 'C-program'. Furthermore we 
discuss a basic expressiveness result: we show that each regular thread is 
the thread extraction of some C-program. Finally we establish that we do 
not need all of C's instructions to preserve expressiveness. 

Definition 1. A C-program is a piece of code $X = i_1; \ldots; i_n$ with $n > 0$
such that the computation of $|X|_j$ for each $j = 1, \ldots, n$ does not use
equation (2). In other words, there are no jumps outside the range of $X$ and
execution can only end by executing either the termination instruction !
or the abort instruction #.

In the setting of program algebra we explicitly distinguished in [9] a 
"program" from an instruction sequence (or a piece of code) in the sense 
that a program has a natural and preferred semantics, while this is not the 
case for the latter one. Observe that if X and Y are C-programs, then so 
is X; Y. A piece of code that is not a program can be called a program 
fragment because it can be extended to a program that yields the same 
thread extraction. This follows from the next proposition, which states 
that position numbers can be relativized. 

Proposition 1. For $k \in \mathbb{N}$ and $X$ a C-expression,

1. $|X|_k = |\#;X|_{k+1}$,

2. $|X|_k = |X;\#|_k$.

Moreover, in the case that $X$ is a C-program and $1 \leq k \leq \ell(X)$,

3. $|X|_k = |/\#k;X|^{\rightarrow}$,

4. $|X|_k = |X;\backslash\#(\ell(X) + 1 - k)|^{\leftarrow}$.

With properties 1 and 2 we find for example

\[
\begin{aligned}
|{+}/a;\backslash\#2|^{\rightarrow} &= |{+}/a;\backslash\#2|_1 \\
&= |\#;{+}/a;\backslash\#2;\#|_2,
\end{aligned}
\]

and since the latter piece of code is a C-program, we find with property 3
another one that produces the same thread with left-to-right thread extraction:

\[
|\#;{+}/a;\backslash\#2;\#|_2 = |/\#2;\#;{+}/a;\backslash\#2;\#|^{\rightarrow}.
\]

Of course, for property 3 to be valid it is crucial that $X$ is a C-program: for
example

\[
\begin{aligned}
|{+}/a;\backslash\#2|^{\rightarrow} &= |{+}/a;\backslash\#2|_1 \\
&\neq |/\#1;{+}/a;\backslash\#2|^{\rightarrow}.
\end{aligned}
\]

A similar example contradicting property 4 for $X$ not a C-program is easily
found.

Theorem 3. Each regular thread in $T_{reg}$ is produced by a C-program.

Proof. Assume that a regular thread $P_1$ is specified by linear equations
$P_1 = t_1, \ldots, P_n = t_n$. We transform each equation into a piece of C-code:

\[
\begin{array}{lcl}
P_i = S & \mapsto & !;\ \#;\ \#, \\
P_i = D & \mapsto & \#;\ \#;\ \#, \\
P_i = P_j \unlhd a \unrhd P_k & \mapsto &
\begin{cases}
{+}/a;\ /\#p;\ /\#q & \text{if } p, q > 0, \\
{+}/a;\ /\#p;\ \backslash\#(-q) & \text{if } p > 0,\ q < 0, \\
{+}/a;\ \backslash\#(-p);\ /\#q & \text{if } p < 0,\ q > 0, \\
{+}/a;\ \backslash\#(-p);\ \backslash\#(-q) & \text{if } p, q < 0,
\end{cases}
\end{array}
\]

where $p = 3(j - i) - 1$ and $q = 3(k - i) - 2$ (so $p, q \in \mathbb{Z} \setminus \{0\}$). Concatenating
these pieces of code in the order given by $P_1, \ldots, P_n$ yields a C-expression $X$
with $|X|^{\rightarrow} = P_1$. By construction $X$ contains no jumps outside the range
of instructions and therefore $X$ is a C-program. Finally, note that the
instructions of $X$ are in the set $\{{+}/a,\ /\#k,\ \backslash\#k,\ !,\ \# \mid a \in A,\ k \in \mathbb{N}^+\}$. □
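The construction in this proof is easily mechanized. The sketch below assumes a specification given as a Python list of right-hand sides for $P_1, \ldots, P_n$, each being `"S"`, `"D"`, or a triple `(j, action, k)` with 1-based indices; the names `jump` and `compile_spec` are hypothetical.

```python
def jump(offset):
    """Forward jump for a positive offset, backward jump for a negative one."""
    return f"/#{offset}" if offset > 0 else f"\\#{-offset}"


def compile_spec(spec):
    """Translate equations P_1..P_n into a C-program, three instructions per equation."""
    pieces = []
    for i, rhs in enumerate(spec, start=1):
        if rhs == "S":
            pieces.append("!;#;#")
        elif rhs == "D":
            pieces.append("#;#;#")
        else:
            j, action, k = rhs
            p = 3 * (j - i) - 1          # offset from the 2nd instruction to piece j
            q = 3 * (k - i) - 2          # offset from the 3rd instruction to piece k
            pieces.append(f"+/{action};{jump(p)};{jump(q)}")
    return ";".join(pieces)


# P1 = a o P2, P2 = P3 <| b |> P4, P3 = c o P2, P4 = P5 <| d |> P1, P5 = S
program = compile_spec([
    (2, "a", 2),
    (3, "b", 4),
    (2, "c", 2),
    (5, "d", 1),
    "S",
])
```

Since the piece for $P_i$ occupies positions $3i-2$, $3i-1$ and $3i$, the test instruction sits at $3i-2$ and both jumps land on the first instruction of the target piece; for instance the last jump of the fourth piece, at position 12, is `\#11` and leads back to position 1.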

From the proof of Theorem [3] we infer that only positive forward test
instructions, jumps and termination are needed to preserve C's expressiveness:

Corollary 1. Let $C^-$ be defined by allowing only instructions from the set

\[
\{{+}/a,\ /\#k,\ \backslash\#k,\ ! \mid a \in A,\ k \in \mathbb{N}^+\}.
\]

Then each regular thread in $T_{reg}$ can be produced by a program in $C^-$.

Proof. With # added to the instruction set mentioned, the result follows
immediately from the proof of Theorem [3]. The use of # in that proof can
easily be avoided, for example by setting

\[
\begin{array}{lcll}
P_i = S & \mapsto & !;\ /\#1;\ \backslash\#1 & \text{(instead of } !;\#;\#\text{)}, \\
P_i = D & \mapsto & /\#1;\ /\#1;\ \backslash\#1 & \text{(instead of } \#;\#;\#\text{)}.
\end{array}
\]

The resulting expression clearly contains no jumps outside its range and is
hence a C-program. □



6 Thread Extraction Preserving Homomorphisms 

In this section we consider functions on C that preserve thread extraction. 
We start with a homomorphism that turns all basic and test instructions 
into their forward counterparts, and another one that only yields positive 
forward test instructions. Then we consider an anti-homomorphism that 
relates extraction with right-to-left thread extraction. So, these functions 
are very basic examples of program transformation. 

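Let the function $h : C \to C$ be defined on C-instructions as follows:

\[
\begin{array}{rcl}
/a & \mapsto & /a;\ /\#2;\ \#, \\
{+}/a & \mapsto & {+}/a;\ /\#2;\ /\#4, \\
{-}/a & \mapsto & {-}/a;\ /\#2;\ /\#4, \\
/\#k & \mapsto & /\#3k;\ \#;\ \#, \\
\backslash a & \mapsto & /a;\ \backslash\#4;\ \#, \\
{+}\backslash a & \mapsto & {+}/a;\ \backslash\#4;\ \backslash\#8, \\
{-}\backslash a & \mapsto & {-}/a;\ \backslash\#4;\ \backslash\#8, \\
\backslash\#k & \mapsto & \backslash\#3k;\ \#;\ \#, \\
! & \mapsto & !;\ \#;\ \#, \\
\# & \mapsto & \#;\ \#;\ \#.
\end{array}
\]

So, $h$ replaces all basic and test instructions by fragments containing only
their forward counterparts. Defining

\[
h(X;Y) = h(X);h(Y)
\]

makes $h$ an injective homomorphism (a 'monomorphism') that preserves the
equivalence obtained by (left-to-right) thread extraction, i.e.,

\[
|X|^{\rightarrow} = |h(X)|^{\rightarrow}.
\]

This follows from the more general property

\[
|X|_{j+1} = |h(X)|_{3j+1}
\]

for all $j < \ell(X)$, which is easy to prove by case distinction. So, $|X|^{\rightarrow} =
|h^k(X)|^{\rightarrow}$, and, moreover, if $X$ is a C-program, then so is $h^k(X)$.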
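The homomorphism $h$ acts instruction by instruction, which makes it straightforward to sketch as a string transformation. In the sketch below the function names `h_instr` and `h` are hypothetical, instructions are ASCII strings separated by `;`, and the clauses for jumps and for `!` (tripling the jump counter, padding with `#`) are assumed to follow the same three-instruction pattern as the other clauses.

```python
import re

INSTR = re.compile(r"^([+-]?)([/\\])(#?)(\w+)$")


def h_instr(instr):
    """Image of a single C-instruction: always a fragment of three instructions."""
    if instr in ("!", "#"):
        return [instr, "#", "#"]
    sign, slash, is_jump, arg = INSTR.match(instr).groups()
    if is_jump:
        # Jump counters are tripled because every instruction becomes three.
        return [f"{slash}#{3 * int(arg)}", "#", "#"]
    forward = f"{sign}/{arg}"                 # forward counterpart of the instruction
    if slash == "/":
        return [forward, "/#2", "/#4" if sign else "#"]
    return [forward, "\\#4", "\\#8" if sign else "#"]


def h(code):
    """Extend h_instr homomorphically: h(X;Y) = h(X);h(Y)."""
    return ";".join(";".join(h_instr(i)) for i in code.split(";"))


x = r"/a;+/b;\c;+/d;!;\#5"
image = h(x)
```

In the image, instruction number $j+1$ of the original sits at position $3j+1$, which is the indexing used in the general property above; the two trailing instructions of each fragment only redirect control to the proper fragment.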

Of course many variants of the homomorphism $h$ satisfy the latter two
properties. A particular one is the homomorphism obtained from $h$ by
replacement with the following defining clauses:

\[
\begin{array}{rcl}
/a & \mapsto & {+}/a;\ /\#2;\ /\#1, \\
{-}/a & \mapsto & {+}/a;\ /\#5;\ /\#1, \\
\backslash a & \mapsto & {+}/a;\ \backslash\#4;\ \backslash\#5, \\
{-}\backslash a & \mapsto & {+}/a;\ \backslash\#7;\ \backslash\#5,
\end{array}
\]

because now only forward positive test instructions occur in the homomorphic
image. In other words: with respect to thread extraction, C's expressive
power is preserved if its set of instructions is reduced to

\[
\{{+}/a,\ /\#k,\ \backslash\#k,\ !,\ \# \mid a \in A,\ k \in \mathbb{N}^+\}.
\]

This is the syntactic counterpart of Corollary [1] in Section [5].
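Let $g : C \to C$ be defined on C-instructions as follows:

\[
\begin{array}{rcl}
/a & \mapsto & \#;\ \backslash\#2;\ \backslash a, \\
{+}/a & \mapsto & \backslash\#4;\ \backslash\#2;\ {+}\backslash a, \\
{-}/a & \mapsto & \backslash\#4;\ \backslash\#2;\ {-}\backslash a, \\
/\#k & \mapsto & \#;\ \#;\ \backslash\#3k, \\
\backslash a & \mapsto & \#;\ /\#4;\ \backslash a, \\
{+}\backslash a & \mapsto & /\#8;\ /\#4;\ {+}\backslash a, \\
{-}\backslash a & \mapsto & /\#8;\ /\#4;\ {-}\backslash a, \\
\backslash\#k & \mapsto & \#;\ \#;\ /\#3k, \\
! & \mapsto & \#;\ \#;\ !, \\
\# & \mapsto & \#;\ \#;\ \#.
\end{array}
\]

So, $g$ replaces all basic and test instructions by C-fragments containing only
their backward counterparts. Defining $g(X;Y) = g(Y);g(X)$ makes $g$ an
anti-homomorphism that satisfies

\[
|X|^{\rightarrow} = |g(X)|^{\leftarrow}.
\]

This follows from a more general property discussed in Section [8].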
7 Structural Bijections and TEC-Automorphisms

In this section we define structural bijections on the finite state threads over 
A as a natural type of (bijective) thread transformations. We then describe 
and analyze the associated class of automorphisms on C, which appear to 
be generated from simple involutions. 

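Given a bijection $\phi$ on $A$ (thus a permutation of $A$) and a partitioning
of $A$ in $A_{true}$ and $A_{false}$, we extend $\phi$ to a structural bijection on BTA by
defining for all $a \in A$ and $P, Q \in \mathrm{BTA}$,

\[
\begin{aligned}
\phi(D) &= D, \\
\phi(S) &= S, \\
\phi(P \unlhd a \unrhd Q) &= \begin{cases}
\phi(P) \unlhd \phi(a) \unrhd \phi(Q) & \text{if } \phi(a) \in A_{true}, \\
\phi(Q) \unlhd \phi(a) \unrhd \phi(P) & \text{if } \phi(a) \in A_{false}.
\end{cases}
\end{aligned}
\]

Structural bijections naturally extend to $T_{reg}$: if $P_i$ is a fixed point for $x_i$
in the finite linear specification $\{x_i = t_i(\bar{x}) \mid i = 1, \ldots, n\}$, then $\phi(P_i)$ is
a fixed point for $y_i$ in

\[
\{y_i = \phi(t_i(\bar{x})) \mid i = 1, \ldots, n,\ \phi(x_i) = y_i\}. \tag{6}
\]

As an example, assume that $\phi(a) = b \in A_{false}$ and thread $P$ is given by

\[
P = P \unlhd a \unrhd Q, \quad Q = D;
\]

then $P' = \phi(P)$ is defined by

\[
P' = Q' \unlhd b \unrhd P', \quad Q' = D.
\]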
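The transformation (6) of a linear specification under a structural bijection can be sketched as follows. The encoding is an assumption of this sketch: a specification is a dictionary from variable names to `"S"`, `"D"`, or a triple `(x_true, action, x_false)`, the permutation is a dictionary `perm`, and `a_false` is the false-part of the partitioning. The function name `apply_bijection` is hypothetical; primed names stand for the new variables $y_i$.

```python
def apply_bijection(spec, perm, a_false):
    """Transform each equation x = t into x' = phi(t), following (6)."""
    out = {}
    for var, rhs in spec.items():
        if rhs in ("S", "D"):
            out[var + "'"] = rhs                 # phi(S) = S and phi(D) = D
            continue
        left, action, right = rhs
        image = perm[action]
        if image in a_false:                     # flip the branches for A_false
            left, right = right, left
        out[var + "'"] = (left + "'", image, right + "'")
    return out


# The example from the text: phi(a) = b with b in A_false.
spec = {"P": ("P", "a", "Q"), "Q": "D"}
result = apply_bijection(spec, {"a": "b", "b": "a"}, a_false={"b"})
```

The result encodes the equations for $P'$ and $Q'$ given in the example; with `a_false` empty the branches would stay in place and only the action would be renamed.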

Theorem 4. There are $2^{|A|} \cdot |A|!$ structural bijections on BTA, and thus on
$T_{reg}$.

Proof. Trivial: if $|A| = n$, there are $2^n$ different partitionings in $A_{true}$ and
$A_{false}$, and $n!$ different bijections on $A$. □
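The count in Theorem 4 can be checked by brute-force enumeration for a small action set. The following sketch is only an illustration; the generator name `structural_bijections` and the representation of a structural bijection as a pair (permutation, false-part) are assumptions.

```python
from itertools import permutations, product
from math import factorial


def structural_bijections(actions):
    """Enumerate all pairs (permutation of A, A_false) for the given action set."""
    actions = sorted(actions)
    for perm in permutations(actions):
        for mask in product([False, True], repeat=len(actions)):
            a_false = frozenset(a for a, m in zip(actions, mask) if m)
            yield dict(zip(actions, perm)), a_false


count = sum(1 for _ in structural_bijections("abc"))
```

For three actions this enumerates $2^3 \cdot 3! = 48$ pairs.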

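Each structural bijection can be written as the composition of a (possibly
empty) series of transpositions or 'swaps' (its permutation part) and a
(possibly empty) series of postconditional 'flips' that model the false-part
of its partitioning. So, for a fixed $\phi$ there exist $k$ and $m$ such that

\[
\phi = \mathit{flip}_{c_1} \circ \cdots \circ \mathit{flip}_{c_m} \circ \mathit{swap}_{a_1,b_1} \circ \cdots \circ \mathit{swap}_{a_k,b_k}
\]

where $\mathit{swap}_{a,b}$ models the exchange of actions $a$ and $b$, and $\mathit{flip}_c$ the
postconditional flips for $A_{false} = \{c_1, \ldots, c_m\}$, and $\phi$ is the identity if $k = m = 0$.
More precisely,

\[
\mathit{swap}_{a,b}(P \unlhd c \unrhd Q) = \mathit{swap}_{a,b}(P) \unlhd \bar{c} \unrhd \mathit{swap}_{a,b}(Q)
\quad\text{with}\quad
\bar{c} = \begin{cases} b & \text{if } c = a, \\ a & \text{if } c = b, \\ c & \text{otherwise,} \end{cases}
\]

and

\[
\mathit{flip}_c(P \unlhd a \unrhd Q) = \begin{cases}
\mathit{flip}_c(Q) \unlhd a \unrhd \mathit{flip}_c(P) & \text{if } a = c, \\
\mathit{flip}_c(P) \unlhd a \unrhd \mathit{flip}_c(Q) & \text{otherwise.}
\end{cases}
\]

For $A = \{a_1, \ldots, a_n\}$ we can do with $n - 1$ swaps $\mathit{swap}_{a_1,a_j}$ ($1 < j \leq n$) as
these define any other swap by $\mathit{swap}_{a_i,a_j} = \mathit{swap}_{a_1,a_i} \circ \mathit{swap}_{a_1,a_j} \circ \mathit{swap}_{a_1,a_i}$,
and $n$ flips $\mathit{flip}_{a_i}$ ($1 \leq i \leq n$).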

We show that structural bijections naturally correspond with a certain 
class of automorphisms on C. 

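Definition 2. An automorphism $\alpha$ on C is thread extraction compatible
(TEC) if there exists a structural bijection $\beta$ such that the following diagram
commutes:

\[
\begin{array}{ccc}
C & \xrightarrow{\ |\cdot|^{\rightarrow}\ } & T_{reg} \\
{\scriptstyle\alpha}\downarrow & & \downarrow{\scriptstyle\beta} \\
C & \xrightarrow{\ |\cdot|^{\rightarrow}\ } & T_{reg}
\end{array}
\]

Theorem 5. The TEC-automorphisms on C are generated by

$\mathit{swap}_{a,b}$: exchanges $a$ and $b$ in all instructions containing $a$ or $b$,
$\mathit{flip}_a$: exchanges + and - in all test instructions containing $a$,

where $a$ and $b$ range over $A$.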

Proof. First we have to show that if $\alpha$ is generated from $\mathit{swap}_{a,b}$ and $\mathit{flip}_a$
($a, b \in A$), then $\alpha$ is a TEC-automorphism. This follows from the fact that
the diagram in Definition [2] commutes for $\mathit{swap}_{a,b}$ if we take $\beta = \mathit{swap}_{a,b}$
and for $\mathit{flip}_a$ if we take $\beta = \mathit{flip}_a$. We show this below.

Then we have to show that if $\alpha$ is a TEC-automorphism, then $\alpha$ is
generated from swaps and flips. Above we argued that each structural
bijection can be characterized by zero or more $\mathit{swap}_{a,b}$ and $\mathit{flip}_a$ applications.
So, again it suffices to argue that for $\beta = \mathit{swap}_{a,b}$, the diagram commutes
if $\alpha = \mathit{swap}_{a,b}$ and for $\beta = \mathit{flip}_c$ if $\alpha = \mathit{flip}_c$. The general case follows from
repeated applications.

Let $X \in C$. First assume $\beta = \mathit{flip}_c$. Following the construction in
the proof of Theorem [2] we find a finite linear specification $\{x_i = t_i \mid i =
1, \ldots, n\}$ with $n = \ell(X)$ such that $|X|_i$ is a fixed point for $x_i$. Transforming
this specification according to (6) with $\phi = \mathit{flip}_c$ yields $\{y_i = \mathit{flip}_c(t_i(\bar{x})) \mid
i = 1, \ldots, n,\ \mathit{flip}_c(x_i) = y_i\}$. Now $|\mathit{flip}_c(X)|_i$ is a fixed point for $y_i$: this also
follows from the construction in the proof of Theorem [2] and the fact that
$\mathit{flip}_c$ only changes the sign of $\pm/c$ and $\pm\backslash c$ in $X$.

We now show that $\mathit{flip}_c(|X|_i)$ is a fixed point for $y_i$ by a case distinction
on the form of $t_i$ in the equations $x_i = t_i$ ($i = 1, \ldots, n$):

• If $x_i = x_j \unlhd c \unrhd x_k$ then $|X|_i = |X|_j \unlhd c \unrhd |X|_k$, so

\[
\begin{aligned}
\mathit{flip}_c(|X|_i) &= \mathit{flip}_c(|X|_j \unlhd c \unrhd |X|_k) \\
&= \mathit{flip}_c(|X|_k) \unlhd c \unrhd \mathit{flip}_c(|X|_j).
\end{aligned}
\]

Note that in this case $y_i = y_k \unlhd c \unrhd y_j$.

• If $x_i = x_j \unlhd a \unrhd x_k$ with $a \neq c$, then $|X|_i = |X|_j \unlhd a \unrhd |X|_k$, so

\[
\begin{aligned}
\mathit{flip}_c(|X|_i) &= \mathit{flip}_c(|X|_j \unlhd a \unrhd |X|_k) \\
&= \mathit{flip}_c(|X|_j) \unlhd a \unrhd \mathit{flip}_c(|X|_k).
\end{aligned}
\]

Note that in this case $y_i = y_j \unlhd a \unrhd y_k$.

• If $x_i = S$, then $|X|_i = S$ and $y_i = S$. Also $\mathit{flip}_c(|X|_i) = S$.

• If $x_i = D$, then $|X|_i = D$ and $y_i = D$. Also $\mathit{flip}_c(|X|_i) = D$.

So in all cases $\mathit{flip}_c(|X|_i)$ is a fixed point for $y_i$. Hence, $|\mathit{flip}_c(X)|_i = \mathit{flip}_c(|X|_i)$
and thus $|\mathit{flip}_c(X)|^{\rightarrow} = \mathit{flip}_c(|X|^{\rightarrow})$.
In a similar way it follows that $|\mathit{swap}_{a,b}(X)|_i = \mathit{swap}_{a,b}(|X|_i)$. □

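Note that $\mathit{swap}_{a,a}$ is the identity and so is $\mathit{flip}_a \circ \mathit{flip}_a$. Furthermore,
for $a \neq b$ we have $\mathit{swap}_{a,b} = \mathit{swap}_{b,a}$ and

\[
\mathit{swap}_{a,b} \circ \mathit{flip}_c = \begin{cases}
\mathit{flip}_c \circ \mathit{swap}_{a,b} & \text{if } c \notin \{a, b\}, \\
\mathit{flip}_d \circ \mathit{swap}_{a,b} & \text{if } \{a, b\} = \{c, d\}.
\end{cases}
\]

This implies that each TEC-automorphism can be represented as

\[
\mathit{flip}_{c_1} \circ \cdots \circ \mathit{flip}_{c_m} \circ \mathit{swap}_{a_1,b_1} \circ \cdots \circ \mathit{swap}_{a_k,b_k}.
\]

Similarly as remarked above, for $A = \{a_1, \ldots, a_n\}$ we can do with $n - 1$
swaps $\mathit{swap}_{a_1,a_j}$ ($1 < j \leq n$) as these define any other swap.

We further write TEC-AUT for the set of TEC-automorphisms, and
we say that $\mathit{swap}_{a,b}$ and the structural bijection $\mathit{swap}_{a,b}$ are associated, and
similarly for $\mathit{flip}_a$ and $\mathit{flip}_a$. So, the above result states that for the associated
pair $\alpha \in$ TEC-AUT and structural bijection $\alpha$ the following diagram
commutes:

\[
\begin{array}{ccc}
C & \xrightarrow{\ |\cdot|^{\rightarrow}\ } & T_{reg} \\
{\scriptstyle\alpha}\downarrow & & \downarrow{\scriptstyle\alpha} \\
C & \xrightarrow{\ |\cdot|^{\rightarrow}\ } & T_{reg}
\end{array}
\]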

The following corollary of Theorem 5 follows immediately.

Corollary 2. If $\bar{\alpha} \in$ TEC-AUT, then $\bar{\alpha}$ preserves the orientation of all
instructions and $\bar{\alpha}(i) = i$ for $i \in$ {/#k, \#k, ! | $k \in N^+$}. Furthermore,
for each $a \in A$, $\bar{\alpha}$ is determined by its value on one of the possible
four test instructions. If for example $\bar{\alpha}$(+/a) = -/b, then $\bar{\alpha}$(/a) = /b,
$\bar{\alpha}$(-/a) = +/b, and the remaining identities are given by replacing all
forward slashes by backward slashes.

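Each element $\bar{\alpha} \in$ TEC-AUT that satisfies $\bar{\alpha}^2(u) = u$ for all
C-instructions $u$ is an involution, i.e.

$$\bar{\alpha}^2(X) = X.$$

Obvious examples of involutions are $\overline{\mathit{swap}}_{a,b}$ and
$\overline{\mathit{flip}}_c$, and a counter-example is

$$\bar{\alpha} = \overline{\mathit{flip}}_b \circ \overline{\mathit{swap}}_{a,b}$$

because

$$\bar{\alpha}^2 = \overline{\mathit{flip}}_b \circ \overline{\mathit{swap}}_{a,b} \circ \overline{\mathit{flip}}_b \circ \overline{\mathit{swap}}_{a,b}
 = \overline{\mathit{flip}}_b \circ \overline{\mathit{flip}}_a \circ \overline{\mathit{swap}}_{a,b} \circ \overline{\mathit{swap}}_{a,b}
 = \overline{\mathit{flip}}_b \circ \overline{\mathit{flip}}_a.$$

However, $\bar{\alpha}^2$ is an involution (because compositions of flip commute).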
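
The identities above can be checked mechanically on single instructions. The following is a minimal sketch, assuming that C-instructions are written as strings such as `+/a` or `\#2` and that actions are single letters; the helper names `flip`, `swap`, `compose` and `same` are hypothetical and only model the effect of the generators on instructions, not on whole pieces of code.

```python
def flip(c):
    """Model of flip_c on instructions: toggle the sign of tests on action c."""
    def f(instr):
        sign, rest = instr[0], instr[1:]
        if sign in "+-" and rest[1:] == c:
            return ("-" if sign == "+" else "+") + rest
        return instr
    return f


def swap(a, b):
    """Model of swap_{a,b} on instructions: exchange the actions a and b."""
    table = {a: b, b: a}
    def f(instr):
        return instr[:-1] + table.get(instr[-1], instr[-1])
    return f


def compose(f, g):
    """Function composition: (f o g)(u) = f(g(u))."""
    return lambda u: f(g(u))


# All basic and test instructions over the actions a, b, c, plus ! and #.
INSTRS = [s + d + x for s in ("", "+", "-") for d in "/\\" for x in "abc"] + ["!", "#"]


def same(f, g):
    """Two instruction maps agree on every instruction in INSTRS."""
    return all(f(u) == g(u) for u in INSTRS)


identity = lambda u: u
alpha = compose(flip("b"), swap("a", "b"))
alpha2 = compose(alpha, alpha)
```

Under these assumptions, `alpha2` agrees with `compose(flip("b"), flip("a"))`, differs from the identity, and composing `alpha2` with itself gives the identity again.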
8 TEC-Anti-Automorphisms

In this section we consider the relation between structural bijections on
threads and an associated class of anti-automorphisms on C. Recall that a
function $\phi$ is an anti-homomorphism if it satisfies $\phi(X;Y) = \phi(Y);\phi(X)$.
Furthermore, we show how the monomorphism h defined in Section 6 is
systematically related to the anti-homomorphism g defined in that section.

Define the anti-automorphism rev : C → C (reverse) on C-instructions
by the exchange of all forward and backward orientations:

    /a ↦ \a,     +/a ↦ +\a,     -/a ↦ -\a,     /#k ↦ \#k,
    \a ↦ /a,     +\a ↦ +/a,     -\a ↦ -/a,     \#k ↦ /#k,
    !  ↦ !,      #   ↦ #.


Then $\mathit{rev}^2(X) = X$, so rev is an involution. Furthermore, it is immediately
clear that for all $X \in C$,

$$|X|^{\rightarrow} = |\mathit{rev}(X)|^{\leftarrow}.$$
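
As an illustration, rev can be sketched on pieces of code represented as lists of instruction strings. This is only a sketch under that representation; the names `rev_instr` and `rev` are chosen here for illustration.

```python
def rev_instr(u):
    """Exchange forward and backward orientation in one C-instruction."""
    return u.translate(str.maketrans("/\\", "\\/"))


def rev(code):
    """Reverse the order of the instructions and flip each orientation."""
    return [rev_instr(u) for u in reversed(code)]


X = ["!", "\\b", "+\\a", "+/a", "\\#2"]
Y = ["+/a", "/#2", "!", "#"]
```

With these definitions `rev(rev(X))` gives back `X`, and `rev(X + Y)` equals `rev(Y) + rev(X)`, which is the anti-homomorphism property for concatenation.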

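Definition 3. An anti-automorphism $\alpha$ on C is thread extraction
compatible (TEC) if there exists a structural bijection $\beta$ such that the following
diagram commutes:

$$\begin{array}{ccc}
C & \xrightarrow{\;|\_|^{\rightarrow}\;} & \mathbb{T}_{reg} \\
\downarrow{\scriptstyle \alpha} & & \downarrow{\scriptstyle \beta} \\
C & \xrightarrow{\;|\_|^{\leftarrow}\;} & \mathbb{T}_{reg}
\end{array}$$
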

We write TEC-AntiAUT for the set of thread extraction compatible 
anti-automorphisms on C. The following result establishes a strong connec- 
tion between TEC-AUT and TEC-AntiAUT. 
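Theorem 6. TEC-AntiAUT = $\{\mathit{rev} \circ \bar{\alpha} \mid \bar{\alpha} \in$ TEC-AUT$\}$.

Proof. Let $\gamma \in$ TEC-AntiAUT, so $\gamma$ is an anti-automorphism and there
is a structural bijection $\beta$ such that $|\gamma(X)|^{\leftarrow} = \beta(|X|^{\rightarrow})$ for all X. By
Theorem 5, $\beta = \alpha$ for some $\bar{\alpha} \in$ TEC-AUT and
$\beta(|X|^{\rightarrow}) = |\bar{\alpha}(X)|^{\rightarrow}$ for all X, and thus

$$|\gamma(X)|^{\leftarrow} = |\mathit{rev} \circ \bar{\alpha}(X)|^{\leftarrow} \quad \text{for all } X. \qquad (7)$$

This defines $\gamma$ on {/#k, \#k, !, # | $k \in N^+$}. By Corollary 2, $\bar{\alpha}$ is
determined by its definition on all positive forward test instructions. So, if
for $a, b \in A$, $\bar{\alpha}$(+/a) = ±/b then we find by (7) with X = +/a; ! that
$\gamma$(+/a) = ±\b. Since $\bar{\alpha}$ is determined for all other instructions containing
a, also $\gamma$ is fully determined for all instructions containing a. It follows that
$\gamma = \mathit{rev} \circ \bar{\alpha}$, thus $\gamma \in \{\mathit{rev} \circ \bar{\alpha} \mid \bar{\alpha} \in$ TEC-AUT$\}$.

Conversely, if $\gamma \in \{\mathit{rev} \circ \bar{\alpha} \mid \bar{\alpha} \in$ TEC-AUT$\}$, say
$\gamma = \mathit{rev} \circ \bar{\alpha}$ with $\bar{\alpha} \in$ TEC-AUT, then
$|\gamma(X)|^{\leftarrow} = |\bar{\alpha}(X)|^{\rightarrow} = \beta(|X|^{\rightarrow})$ for some structural
bijection $\beta$ and all X. Furthermore, $\gamma$ is an anti-automorphism, so $\gamma \in$
TEC-AntiAUT. □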

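Observe that for all $\bar{\alpha} \in$ TEC-AUT, $\bar{\alpha} \circ \mathit{rev} = \mathit{rev} \circ \bar{\alpha}$, and for all
$\alpha, \beta \in$ TEC-AntiAUT, $\alpha \circ \beta \in$ TEC-AUT. Using the notation for associated
pairs we find for $\beta = \mathit{rev} \circ \bar{\alpha} \in$ TEC-AntiAUT that the following diagram
commutes:

$$\begin{array}{ccc}
C & \xrightarrow{\;|\_|^{\rightarrow}\;} & \mathbb{T}_{reg} \\
\downarrow{\scriptstyle \beta} & & \downarrow{\scriptstyle \alpha} \\
C & \xrightarrow{\;|\_|^{\leftarrow}\;} & \mathbb{T}_{reg}
\end{array}$$

Note that we use $\alpha$, i.e., the associated structural bijection of $\bar{\alpha}$, in this
diagram.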

Another application with rev is the following: for h : C → C a
monomorphism, the following diagram commutes:

[Diagram: the map $\mathit{rev} \circ h$ on C together with thread extraction from C to $\mathbb{T}_{reg}$.]

As an example, consider the anti-homomorphism g defined in Section 6:
indeed $g = \mathit{rev} \circ h$ for the homomorphism h defined in that section.
9 Expressiveness and reduced instruction sets



In this section we further consider C's instructions in the perspective of
expressiveness. We show that setting a bound on the size of jump counters
in C does have consequences with respect to expressiveness: let

$$C_k$$

be defined by allowing only jump instructions with counter value k or less.

We first introduce some auxiliary notions: following the definition of
residual threads in Section 2, we say that thread Q is a 0-residual of thread
P if P = Q, and an (n+1)-residual of P if for some $a \in A$, $P = P_1 \unlhd a \unrhd P_2$
and Q is an n-residual of $P_1$ or of $P_2$. Note that a finite thread (in BTA)
only has n-residuals for finitely many n, while for the thread P defined by
$P = a \circ P$ it holds that P is an n-residual of itself for each $n \in N$.

Let $a \in A$ be fixed and $n \in N^+$. Thread P has the a-n-property if
$\pi_n(P) = a^n \circ D$ and P has $2^n - 1$ (different) n-residuals which all have a
first approximation not equal to $a \circ D$. So, if a thread P has the a-n-property,
then n consecutive a-actions can be executed and each sequence of n replies
leads to a unique n-residual. Moreover, none of these residual threads starts
with an a-action (by the requirement on their first approximation). We note
that for each $n \in N^+$ we can find a finite thread with the a-n-property. In
the next section we return to this point.

A piece of code X has the a-n-property if for some i, $|X|_i$ has this
property. It is not hard to see that in this case X contains at least $2^n - 1$
different a-tests. As an example, consider

    X = !; \b; +\a; +/a; \#2; +/a; /#2; /c; #

Clearly, X has the a-2-property because $|X|_4$ has this property: its
2-residuals are $b \circ S$, $S$, $D$ and $c \circ D$, so each thread is not equal to one of
the others and does not start with an a-action.

Note that if a piece of code X has the a-(n+k)-property, then it also
has the a-n-property. In the example above, X has the a-1-property because
$|X|_3$ has this property (and $|X|_6$ too).
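
The 2-residuals claimed for the example can be recomputed with a small interpreter. The sketch below rests on assumptions that are not restated in this section: instructions are strings, a thread is `"S"`, `"D"` or a triple `(action, true_branch, false_branch)`, a positive test moves one position on reply true and two on reply false (in the direction of the instruction), a negative test does the opposite, and leaving the code or looping through jumps yields `D`. The function names are hypothetical.

```python
def parse(instr):
    """Split a C-instruction into (sign, direction, body); direction is +1 or -1."""
    sign = instr[0] if instr[0] in "+-" else ""
    rest = instr[len(sign):]
    direction = 1 if rest[0] == "/" else -1
    return sign, direction, rest[1:]


def extract(code, i, depth=20):
    """Depth-bounded approximation of |X|_i; positions i are 1-based."""
    jumped = set()  # jump positions visited since the last action
    while True:
        if depth == 0 or not 1 <= i <= len(code) or i in jumped:
            return "D"
        instr = code[i - 1]
        if instr == "!":
            return "S"
        if instr == "#":
            return "D"
        sign, d, body = parse(instr)
        if not body.startswith("#"):
            break
        jumped.add(i)
        i += d * int(body[1:])
    near = extract(code, i + d, depth - 1)
    if sign == "":
        return (body, near, near)
    far = extract(code, i + 2 * d, depth - 1)
    return (body, near, far) if sign == "+" else (body, far, near)


def residuals(t, n):
    """All n-residuals of a finite thread, true branches first."""
    if n == 0:
        return [t]
    if t in ("S", "D"):
        return []
    _, p, q = t
    return residuals(p, n - 1) + residuals(q, n - 1)


X = ["!", "\\b", "+\\a", "+/a", "\\#2", "+/a", "/#2", "/c", "#"]
two_residuals = residuals(extract(X, 4), 2)
```

Under these assumptions `two_residuals` lists the threads b∘S, S, D and c∘D in that order, in agreement with the example.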

Lemma 1. For each $k \in N$ there exists $n \in N^+$ such that no $X \in C_k$ has
the a-n-property.

Proof. Suppose the contrary and let k be minimal in this respect. Assume
for each $n \in N^+$, $Y_n \in C_k$ has the a-n-property.

Let $B = \{\mathit{true}, \mathit{false}\}$. For $\alpha, \beta \in B^*$ we write

$$\alpha \preceq \beta$$

if $\alpha$ is a prefix of $\beta$, and we write $\alpha \prec \beta$ or $\beta \succ \alpha$ if
$\alpha \preceq \beta$ and $\alpha \neq \beta$. Furthermore, let

$$B^{\le n} = \bigcup_{i=0}^{n} B^i,$$

thus $B^{\le n}$ contains all $B^*$-sequences $\alpha$ with $\ell(\alpha) \le n$ (there are $2^{n+1} - 1$
such sequences).

Let $g : N \to N$ be such that $|Y_n|_{g(n)}$ has the a-n-property. Define

$$f_n : B^{\le n} \to N^+$$

by $f_n(\alpha) = m$ if the instruction reached in $Y_n$ when execution started at
position $g(n)$ after the replies to a according to $\alpha$ has position m. Clearly,
$f_n$ is an injective function.

In the following claim we show that under the supposition made in this
proof a certain form of squeezing holds: if $k'$ is sufficiently large, then for
all $n > 0$ there exist $\alpha, \beta, \gamma \in B^{k'}$ with
$f_{k'+n}(\alpha) < f_{k'+n}(\beta) < f_{k'+n}(\gamma)$
with the property that $f_{k'+n}(\alpha) < f_{k'+n}(\beta') < f_{k'+n}(\gamma)$ for each extension
$\beta'$ of $\beta$ within $B^{\le k'+n}$. This claim is proved by showing that not having
this property implies that "too many" such extensions $\beta'$ exist. Using this
claim it is not hard to contradict the minimality of k.

Claim 1. Let $k'$ satisfy $2^{k'} \ge 2k + 3$. Then for all $n > 0$ there exist
$\alpha, \beta, \gamma \in B^{k'}$ with

$$f_{k'+n}(\alpha) < f_{k'+n}(\beta) < f_{k'+n}(\gamma)$$

such that for each extension $\beta' \succeq \beta$ in $B^{\le k'+n}$,

$$f_{k'+n}(\alpha) < f_{k'+n}(\beta') < f_{k'+n}(\gamma).$$

Proof of Claim 1. Let $k'$ satisfy $2^{k'} \ge 2k + 3$. Towards a contradiction,
suppose the stated claim is not true for some $n > 0$. The sequences in $B^{k'}$
are totally ordered by $f_{k'+n}$, say

$$f_{k'+n}(\alpha_1) < f_{k'+n}(\alpha_2) < \cdots < f_{k'+n}(\alpha_{2^{k'}}).$$
Consider the following list of sequences:

$$\alpha_1,\ \underbrace{\alpha_2, \ldots, \alpha_{2k+2}}_{\text{choices for } \beta},\ \alpha_{2k+3}$$

By supposition there is for each choice $\beta \in \{\alpha_2, \ldots, \alpha_{2k+2}\}$ an extension
$\beta' \succeq \beta$ in $B^{\le k'+n}$ with

$$\text{either } f_{k'+n}(\beta') < f_{k'+n}(\alpha_1), \text{ or } f_{k'+n}(\beta') > f_{k'+n}(\alpha_{2k+3}).$$

Because there are $2k + 1$ choices for $\beta$, assume that at least $k + 1$ elements
$\beta \in \{\alpha_2, \ldots, \alpha_{2k+2}\}$ have an extension with

$$f_{k'+n}(\beta') < f_{k'+n}(\alpha_1)$$

(the assumption $f_{k'+n}(\beta') > f_{k'+n}(\alpha_{2k+3})$ for at least $k + 1$ elements $\beta$ with
extension $\beta'$ leads to a similar argument). Then we obtain a contradiction
with respect to $f_{k'+n}$: for each of the sequences $\beta$ in the subset just selected
and its extension $\beta'$,

$$f_{k'+n}(\beta') < f_{k'+n}(\alpha_1) < f_{k'+n}(\beta),$$

and there are at least $k + 1$ different such pairs $\beta, \beta'$ (recall $f_{k'+n}$ is injective).
But this is not possible with jumps of at most k because the $f_{k'+n}$ values of
each of these pairs define a path in $Y_{k'+n}$ that never has a gap that exceeds
k and that passes position $f_{k'+n}(\alpha_1)$, while different paths never share a
position. This finishes the proof of Claim 1. □

Take according to Claim 1 an appropriate value $k'$, some value $n > 0$
and $\alpha, \beta, \gamma \in B^{k'}$. Consider $Y_{k'+n}$ and mark the positions that are used for
the computations according to $\alpha$ and $\gamma$: these computations both start in
position $g(k'+n)$ and end in $f_{k'+n}(\alpha)$ and $f_{k'+n}(\gamma)$, respectively. Note that
the set of marked positions never has a gap that exceeds k.

Now consider a computation that starts from instruction $f_{k'+n}(\beta)$ in
$Y_{k'+n}$, a position in between $f_{k'+n}(\alpha)$ and $f_{k'+n}(\gamma)$. By Claim 1, the first
n a-instructions have positions in between $f_{k'+n}(\alpha)$ and $f_{k'+n}(\gamma)$ and none
of these are marked. Leaving out all marked positions and adjusting the
associated jumps yields a piece of code, say Y, with smaller jumps, thus
in $C_{k-1}$, that has the a-n-property. Because n was chosen arbitrarily, this
contradicts the initial supposition that k was minimal. □

Theorem 7. For any $k \in N^+$, not all threads in BTA can be expressed in
$C_k$. This is also the case if thread extraction may start at arbitrary positions.

Proof. Fix some value k. Then, by Lemma 1 we can find a value n such
that no $X \in C_k$ has the a-n-property. But we can define a finite thread
that has this property. □

In the next section we discuss a systematic approach to define finite 
threads that have the a-n-property. 



10 Boolean Registers for Producing Threads 

In this section we briefly discuss the use of Boolean registers to ease
programming in C. This is an example of so-called thread-service composition.
In Appendix C we provide a brief but general introduction to thread-service
composition.

Consider Boolean registers named b1, b2, . . . , bn which all are initially
set to F (false) and can be set to T (true). We write bi(b) with $b \in \{T, F\}$
to indicate that bi's value is b. The action bi.set:b sets register bi to b and
yields true as its reply. The action bi.get reads the value from register bi
and provides this value as its reply. The defining rules for threads in BTA
that use one of these registers are for $b, b' \in \{T, F\}$, $i \in \{1, \ldots, n\}$:

$$\begin{array}{rcl}
S \,/_{bi}\, bi(b) &=& S, \\
D \,/_{bi}\, bi(b) &=& D, \\
(P \unlhd bi.set{:}b' \unrhd Q) \,/_{bi}\, bi(b) &=& P \,/_{bi}\, bi(b'), \\
(P \unlhd bi.get \unrhd Q) \,/_{bi}\, bi(b) &=&
\begin{cases}
P \,/_{bi}\, bi(b) & \text{if } b = T, \\
Q \,/_{bi}\, bi(b) & \text{if } b = F,
\end{cases}
\end{array}$$

and, if none of these rules apply,

$$(P \unlhd a \unrhd Q) \,/_{bi}\, bi(b) = (P \,/_{bi}\, bi(b)) \unlhd a \unrhd (Q \,/_{bi}\, bi(b)).$$

The operator $/_{bi}$ is called the use operator and stems from [8]. Observe
that the requests to the service bi do not occur as actions in the behavior
of a thread-service composition. So the composition hides the associated
actions.

As a simple example consider the C-program X that has extra instructions
based on the set $\{bi.set{:}b,\ bi.get \mid b \in \{T, F\},\ i \in \{1, 2\}\}$:

    X = +/a; /b1.set:T;
        +/a; /b2.set:T;
        +/b1.get; c; d;
        +/b2.get; c; d; !

Then one can derive (recall the initial value of b1 and b2 is F):

$$\begin{array}{rcl}
(|X|^{\rightarrow} \,/_{b1}\, b1) \,/_{b2}\, b2
 &=& (|X|_3 \,/_{b1}\, b1(T) \;\unlhd a \unrhd\; |X|_3 \,/_{b1}\, b1(F)) \,/_{b2}\, b2 \\
 &=& (R_1 \unlhd a \unrhd R_2) \unlhd a \unrhd (R_3 \unlhd a \unrhd R_4)
\end{array}$$

where $R_1 = c \circ d \circ c \circ d \circ S$ (case T, T), $R_2 = c \circ d \circ d \circ S$ (case T, F),
$R_3 = d \circ c \circ d \circ S$ (case F, T), and $R_4 = d \circ d \circ S$ (case F, F). So, the
four possible combinations of the values of b1 and b2 yield the different
2-residuals $R_1, \ldots, R_4$. Clearly, X has the a-2-property. The particular form
of the C-program X already suggests how to generalize X to a family of
C-programs $Z_n$ ($n \in N^+$) such that

$$(\ldots(|Z_n|^{\rightarrow} \,/_{b1}\, b1) \ldots) \,/_{bn}\, bn$$

has the a-n-property:

    Z_n = +/a; /b1.set:T;
          +/a; /b2.set:T;
          ...
          +/a; /bn.set:T;
          +/b1.get; c; d;
          +/b2.get; c; d;
          ...
          +/bn.get; c; d; !

Each series of n replies to the positive test instructions +/a has a unique
continuation after which $Z_n$ terminates successfully: the number of true-replies
matches the number of c-actions, and their ordering that of the
occurring d-actions. Obviously, each thread
$(\ldots(|Z_n|^{\rightarrow} \,/_{b1}\, b1) \ldots) \,/_{bn}\, bn$ is a
finite thread in BTA and can thus be produced by a C-program not using
Boolean registers (cf. Theorem 3).

More information about thread-service composition is given in
Appendix C.
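
The derivation for X (which is $Z_2$) and its generalization can be replayed in a few lines. This is a sketch under stated assumptions: c and d are read as the forward basic instructions /c and /d, only the forward instructions occurring in $Z_n$ are interpreted, threads are `"S"`, `"D"` or triples `(action, true_branch, false_branch)`, and all function names are hypothetical.

```python
def z(n):
    """The program Z_n as a list of instruction strings."""
    code = []
    for i in range(1, n + 1):
        code += ["+/a", f"/b{i}.set:T"]
    for i in range(1, n + 1):
        code += [f"+/b{i}.get", "/c", "/d"]
    return code + ["!"]


def thread(code, i=0):
    """Left-to-right thread extraction for forward basic/test instructions and !."""
    if i >= len(code):
        return "D"
    u = code[i]
    if u == "!":
        return "S"
    if u.startswith("+/"):
        return (u[2:], thread(code, i + 1), thread(code, i + 2))
    t = thread(code, i + 1)
    return (u[1:], t, t)


def use(t, reg, val=False):
    """The use operator for one Boolean register named reg holding val."""
    if t in ("S", "D"):
        return t
    act, p, q = t
    if act == f"{reg}.set:T":
        return use(p, reg, True)
    if act == f"{reg}.set:F":
        return use(p, reg, False)
    if act == f"{reg}.get":
        return use(p if val else q, reg, val)
    return (act, use(p, reg, val), use(q, reg, val))


def compose(n):
    """(...(|Z_n| / b1) ...) / bn with all registers initially F."""
    t = thread(z(n))
    for i in range(1, n + 1):
        t = use(t, f"b{i}")
    return t


def residuals(t, n):
    """All n-residuals of a finite thread, true branches first."""
    if n == 0:
        return [t]
    if t in ("S", "D"):
        return []
    _, p, q = t
    return residuals(p, n - 1) + residuals(q, n - 1)


def seq(*actions):
    """The thread a1 o a2 o ... o S for the given actions."""
    t = "S"
    for a in reversed(actions):
        t = (a, t, t)
    return t
```

For n = 2 the 2-residuals of `compose(2)` come out as `seq("c","d","c","d")`, `seq("c","d","d")`, `seq("d","c","d")` and `seq("d","d")`, matching $R_1, \ldots, R_4$; for n = 3 the same construction gives eight pairwise different 3-residuals.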
11 On the Length of C-Programs for Producing Threads



C-programs can be viewed as descriptions of finite state threads. In this 
section we consider the question which program length is needed to produce 
a finite state thread. We also consider the case that auxiliary Boolean 
registers are used for producing threads, which can be a very convenient 
feature as was shown in the previous section. We find upper and lower 
bounds for the lengths of C-programs. 

For $k, n \in N^+$ let

$$\psi(k, n) \in N^+$$

be the minimal value such that each thread over alphabet $a_1, \ldots, a_k$ with
at most n states can be expressed as a C-program with at most $\psi(k, n)$
instructions. Furthermore, let

$$\psi_{br}(k, n) \in N^+$$

be the minimal value such that each thread over alphabet $a_1, \ldots, a_k$ with
at most n states can be expressed as a C-program with at most $\psi_{br}(k, n)$
instructions including those to use Boolean registers.
It is not hard to see that

$$\psi(k, n) \le 3n \quad \text{and} \quad \psi_{br}(k, n) \le 3n$$

because each state can be described by either the piece of code

    +/a_i; u; v

with u and v jumps to the pieces of code that model the two successor
states, or by ! or #. Presumably, a sharper upper bound for both $\psi(k, n)$
and $\psi_{br}(k, n)$ can be found.

As for a lower bound for $\psi_{br}(k, n)$, we can use auxiliary Boolean registers
by forward basic instructions

    /bi.set:T
    /bi.set:F
    /bi.get

and their backward and test counterparts. So, each Boolean register bi
comes with 18 different instructions, and of course at most $\psi_{br}(k, n)$ of
these can be used.

Programs containing at most $l = \psi_{br}(k, n)$ instructions contain per
position i at most $l - 1$ jump instructions, namely jumps to all other (at
most $l - 1$) positions in the program.

So, if we restrict to k = 1, say /a is the only forward basic instruction
involved (with backward and test variants yielding 5 more instructions) and
include the termination instruction ! and the abort instruction #, the
admissible instruction alphabet counts

$$2 + 6 + (l - 1) + 18l$$

instructions. Because $l \ge 1$, this is bounded by $26l$ instructions, and
therefore we count at most

$$(26l)^l$$

syntactically different programs.

A lower bound on the number of threads with n states over one action
a can be estimated as follows: let F range over all functions

$$\{1, \ldots, n-1\} \to \{0, 1, \ldots, n-1\},$$

thus there are $n^{n-1}$ different F. Define threads $P^F_k$ for $k = 0, \ldots, n-1$ by

$$\begin{array}{rcl}
P^F_0 &=& S \\
P^F_{k+1} &=& P^F_{F(k+1)} \unlhd a \unrhd P^F_k
\end{array}$$

We claim that for a fixed n the threads $P^F_{n-1}$ (each one containing n states
$P^F_0, \ldots, P^F_{n-1}$) are for each F different, thus yielding $n^{n-1}$ different threads,
so we find

$$(26l)^l \ge n^{n-1}. \qquad (8)$$

Assume $n \ge 2$, thus $26 \le 25n$, thus $n \le 26n - 26$, thus $\frac{n}{26} \le n - 1$. Suppose
$l < \frac{n}{26}$, then $26l < n$ and $l < n - 1$, which contradicts (8). Thus $l \ge \frac{n}{26}$.
So, for k = 1 and in fact for arbitrary $k \ge 1$ we find

$$\frac{n}{26} \le \psi_{br}(k, n) \le 3n.$$

In the case that we do not allow the use of auxiliary Boolean registers,
it follows in the same manner as above that for arbitrary $k \ge 1$,

$$\frac{n}{8} \le \psi(k, n) \le 3n.$$
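
The arithmetic behind these lower bounds can be checked numerically. The sketch below only tests the counting inequality, not the claim about the number of different threads; `min_length` is a hypothetical helper, and the base 8 for the register-free case is taken from the alphabet count $2 + 6 + (l - 1) \le 8l$.

```python
def min_length(n, base=26):
    """Smallest l >= 1 with (base * l) ** l >= n ** (n - 1)."""
    l = 1
    while (base * l) ** l < n ** (n - 1):
        l += 1
    return l


def alphabet_size(l, registers=True):
    """Instruction alphabet count used in the text for programs of length l."""
    return 2 + 6 + (l - 1) + (18 * l if registers else 0)
```

For every n from 2 up to a few hundred, `base * min_length(n, base) >= n` holds for both bases, so any l satisfying the inequality is at least n/26 (respectively n/8).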

We see it as a challenging problem to improve the bounds of $\psi_{br}(k, n)$
and $\psi(k, n)$.



12 Discussion 



In this paper we proposed an algebra of instruction sequences based on a set 
of instructions without directional bias. The use of the phrase "instruction 
sequence" asks for some rigorous motivation. This is a subtle matter which 
defeats many common sense intuitions regarding the science of computer 
programming. 

The Latin source of the word 'instruction' tells us no more than that
the instruction is part of a listing. On that basis, instruction sequence is a
pleonasm and justification is problematic (see footnote 6). We need to add the additional
connotation of instruction as a "unit of command". This puts instructions
at a core position. Maurer's paper A theory of computer instructions [12] 
provides a theory of instructions which can be taken on board in an attempt 
to define what is an instruction in this more narrow sense. Now Maurer's 
instructions certainly qualify as such but his survey is not exhaustive. His 
theory has an intentional focus on transformation of data while leaving 
change of control unexplained. We hold that Maurer's theory, including 
his ongoing work on this theme in [13], provides a candidate definition for 
so-called basic instructions. 

At this stage different arguments can be used to make progress. Suppose
a collection $\mathcal{I}$ is claimed to constitute a set of instructions:

6 [10] : INSTRUCTION, in Latin instructio, comes from in and struo to dispose or 
regulate, signifying the thing laid down. 

The following is taken from http://www.etymonline.com/ INSTRUCTION: from
O.Fr. instruction, from L. instructionem (nom. instructio) "building, arrangement, teach- 
ing," from instructus, pp. of instruere "arrange, inform, teach," from in- "on" + struere 
"to pile, build" (see structure). 
1. If the mnemonics of elements of $\mathcal{I}$ are reminiscent of known instructions
of some low level program notations, and if the semantics provided
complies with that view, the use of these terms may be considered
justified.

2. If, however, unknown, uncommon or even novel instructions are included
in $\mathcal{I}$, the argument of item 1 cannot be used. Of course some
similarity of explanation can be used to carry the jargon beyond
conventional use. At some stage, however, a more intrinsic justification
may be needed.

3. A different perspective emerges if one asserts that certain instruction
sequences constitute programs, thus considering $\mathcal{I}^+$ (i.e., finite,
non-empty sequences of instructions from $\mathcal{I}$) one may determine a subset
$\mathcal{P} \subset \mathcal{I}^+$ of programs. Now a sequence in $\mathcal{I}^+$ qualifies as a program if
and only if it is in $\mathcal{P}$. In the context of C-expressions we say that

    +/a; \#10; /b; +/c; /#8; !; !

is not in $\mathcal{P}$ because the jumps outside the range of instructions cannot
be given a natural and preferred semantics, as opposed to +/a; \#1; !
and +/a; /b; +/c; !; !. We here state once more that we do not
consider the empty sequence of instructions as a program, or even as
an instruction sequence because we have no canonical meaning or even
intuition about such an empty sequence in this context.

4. The next question is how to determine $\mathcal{P}$. At this point we make use
of the framework of PGA [14] (for a brief explanation of PGA see
Appendix A). A program is a piece of data for which the preferred and
natural meaning is a "sequence of primitive instructions", abbreviated
to a SPI. Primitive instructions are defined over some collection A
of basic instructions. The meaning of a program X is by definition
provided by means of a projection function which produces a SPI for
X. Using PGA as a notation for SPIs, the projection function can be
written p2pga ("$\mathcal{P}$ to PGA"). The behavior $|X|_{\mathcal{P}}$ for $X \in \mathcal{P}$ is given
by

$$|X|_{\mathcal{P}} = |\mathrm{p2pga}(X)|$$

where thread extraction in PGA, i.e., $|\ldots|$, is supposed to be known.
5. In the particular case of $\mathcal{I}$ consisting of C's instructions, we take
for $\mathcal{P}$ those instruction sequences for which control never reaches
outside the sequence. These are the sequences that we called
C-programs. First we restrict to C-programs composed from instructions
in {/a, +/a, -/a, /#k, \#k, !, # | $a \in A$, $k \in N^+$} and we define

$$F(i_1; \ldots; i_n) = (\psi(i_1); \ldots; \psi(i_n))^{\omega}$$

as a "pre-projection function" that uses an auxiliary function $\psi$ on
these instructions:

    ψ(/a)   = a,
    ψ(+/a)  = +a,
    ψ(-/a)  = -a,
    ψ(/#k)  = #k,
    ψ(\#k)  = #n-k,
    ψ(!)    = !,
    ψ(#)    = #0.

We can rewrite each C-program into this restricted form by applying
the behavior preserving homomorphism h defined in Section 6. Thus
our final definition of a projection can be p2pga = F ∘ h. Note that
many alternatives for h could have been used as well (as was already
noted in Section 6).

6. Conversely, each PGA-program can be embedded into C while its
behavior is preserved. For repetition free programs this embedding is
defined by the addition of forward slashes and replacing #0 by # (see footnote 7).
In the other case, the PGA-program can be embedded into PGLB, a
variant of PGA with backward jumps and no repetition operator [7],
and transformation from PGLB to C is trivial.

In the case of C, items 4 and 5 above should of course be proved, i.e., for a
C-program X,

$$|X|^{\rightarrow} = |X|_C \ (= |\mathrm{p2pga}(X)|),$$

and for item 6 a similar requirement about the definition of $|\ldots|^{\rightarrow}$ should
be substantiated. We omit these proofs as they seem rather clear.
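
The pre-projection function of item 5 can be sketched as follows. Instructions are assumed to be strings, the returned list is to be read as the body of a repetition $(\ldots)^{\omega}$, and the names `psi` and `pre_projection` are chosen here for illustration only.

```python
def psi(u, n):
    """Translate one restricted C-instruction of a program of length n."""
    if u == "!":
        return "!"
    if u == "#":
        return "#0"
    if u.startswith("/#"):
        return "#" + u[2:]
    if u.startswith("\\#"):
        # A backward jump over k positions becomes a forward jump over n - k
        # positions in the repeating instruction sequence.
        return "#" + str(n - int(u[2:]))
    if u[0] in "+-" and u[1] == "/":
        return u[0] + u[2:]
    if u[0] == "/":
        return u[1:]
    raise ValueError(f"not in the restricted instruction set: {u!r}")


def pre_projection(code):
    """Body of the periodic SPI F(i_1; ...; i_n)."""
    return [psi(u, len(code)) for u in code]
```

For example, `pre_projection(["+/a", "\\#1", "!"])` gives `["+a", "#2", "!"]`: the backward jump \#1 at position 2 of a program of length 3 turns into #2, which in the repeating sequence lands on the first instruction of the next copy.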

7 The instruction # already occurred in [B], but was in [7] replaced by #0, thus admitting
a more systematic treatment of "jumps".
A PGA, a summary

Let a set A of constants with typical elements a, b, c, . . . be given. PGA-programs
are of the following form ($a \in A$, $k \in N$):

$$P ::= a \mid +a \mid -a \mid \#k \mid\ ! \mid P;P \mid P^{\omega}.$$

Each of the first five forms above is called a primitive instruction. We write $\mathcal{U}$
for the set of primitive instructions and we define each element of $\mathcal{U}$ to be a SPI
(Sequence of Primitive Instructions).

Finite SPIs are defined using concatenation: if P and Q are SPIs, then so is 

P;Q 

which is the SPI that lists Q's primitive instructions right after those of P, and we 
take concatenation to be an associative operator. 

Periodic SPIs are defined using the repetition operator: if P is a SPI, then

$$P^{\omega}$$

is the SPI that repeats P forever, thus P; P; P; . . . . Typical identities that relate
repetition and concatenation of SPIs are

$$(P;P)^{\omega} = P^{\omega} \quad \text{and} \quad (P;Q)^{\omega} = P;(Q;P)^{\omega}.$$

Another typical identity is

$$P^{\omega};Q = P^{\omega},$$

expressing that nothing "can follow" an infinite repetition.

The execution of a SPI is single-pass: it starts with the first (left-most) in- 
struction, and each instruction is dropped after it has been executed or jumped 
over. 

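Equations for thread extraction on SPIs, notation |X|, are the following, where
a ranges over A, u over the primitive instructions $\mathcal{U}$, and $k \in N$:

$$\begin{array}{rclrcl}
|!| &=& S & |!;X| &=& S \\
|a| &=& a \circ D & |a;X| &=& a \circ |X| \\
|{+a}| &=& a \circ D & |{+a};X| &=& |X| \unlhd a \unrhd |\#2;X| \\
|{-a}| &=& a \circ D & |{-a};X| &=& |\#2;X| \unlhd a \unrhd |X| \\
|\#k| &=& D & |\#0;X| &=& D \\
 & & & |\#1;X| &=& |X| \\
 & & & |\#k{+}2;u| &=& D \\
 & & & |\#k{+}2;u;X| &=& |\#k{+}1;X|
\end{array}$$

For more information on PGA we refer to [7, 14].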
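
For finite SPIs these equations translate directly into a recursive function. The sketch below assumes a SPI is given as a list of strings such as `+a`, `#2` or `!` and represents a thread as `"S"`, `"D"` or a triple `(action, true_branch, false_branch)`; periodic SPIs are not handled, and `pga_thread` is a hypothetical name.

```python
def pga_thread(spi, i=0):
    """Thread of the finite SPI spi, started at (0-based) index i."""
    if i >= len(spi):
        return "D"          # running off the end: |a| = a o D, |#k| = D
    u = spi[i]
    if u == "!":
        return "S"
    if u.startswith("#"):
        k = int(u[1:])
        # #0 yields D; #k with k >= 1 skips the next k - 1 instructions.
        return "D" if k == 0 else pga_thread(spi, i + k)
    if u[0] == "+":
        return (u[1:], pga_thread(spi, i + 1), pga_thread(spi, i + 2))
    if u[0] == "-":
        return (u[1:], pga_thread(spi, i + 2), pga_thread(spi, i + 1))
    t = pga_thread(spi, i + 1)
    return (u, t, t)
```

For instance, `pga_thread(["+a", "b", "!"])` is `("a", ("b", "S", "S"), "S")`: after a positive reply to a the action b is performed before termination, after a negative reply b is skipped.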

B Basic Thread Algebra and Finite Approximations

An elegant result based on [2] is that equality of recursively specified regular threads
can be easily decided. Because one can always take the disjoint union of two
finite linear recursive specifications, it suffices to consider a single specification
$\{P_i = t_i \mid 1 \le i \le n\}$. Then $P_i = P_j$ follows from

$$\pi_{n-1}(P_i) = \pi_{n-1}(P_j).$$

Thus, it is sufficient to decide whether two certain finite threads are equal. We
provide a proof sketch:

For $k \ge 0$ consider the equivalence relation $\cong_k$ on $\{P_1, \ldots, P_n\}$ defined by
$P_i \cong_k P_j$ if $\pi_k(P_i) = \pi_k(P_j)$. Then

$$\cong_0\ \supseteq\ \cong_1\ \supseteq\ \cong_2\ \supseteq \cdots \qquad (9)$$

If $\cong_k = \cong_{k+1}$ then $\cong_{k+1} = \cong_{k+2}$. This follows from (9) and
$\cong_{k+1} \subseteq \cong_{k+2}$. Suppose the latter is not true, then
$\pi_{k+1}(P_i) = \pi_{k+1}(P_j)$ while $\pi_{k+2}(P_i) \neq \pi_{k+2}(P_j)$.
The only possible cases are that $P_i = P_m \unlhd a \unrhd P_p$ and $P_j = P_{m'} \unlhd a \unrhd P_{p'}$ and
$\pi_{k+1}(P_m) \neq \pi_{k+1}(P_{m'})$ or $\pi_{k+1}(P_p) \neq \pi_{k+1}(P_{p'})$. So by $\cong_k = \cong_{k+1}$, at least
one of $\pi_k(P_m) \neq \pi_k(P_{m'})$ and $\pi_k(P_p) \neq \pi_k(P_{p'})$ must be true, but this refutes
$\pi_{k+1}(P_i) = \pi_{k+1}(P_j)$. So, once the sequence (9) becomes constant, it remains
constant. Since this sequence is decreasing and the maximum number of equivalence
classes on $\{P_1, \ldots, P_n\}$ is n, at most the first n relations in the sequence can be
unequal, hence $\cong_{n-1} = \cong_n$, and thus $\pi_{n-1}(P_i) = \pi_{n-1}(P_j)$ implies $\pi_k(P_i) = \pi_k(P_j)$
for all $k \in N$.

It is not difficult to show for threads P and Q: if $\pi_k(P) = \pi_k(Q)$ for all $k \in N$
then P = Q. First, each (infinite) thread is a projective sequence on which $\pi_k$ is
defined componentwise. Secondly, for a projective sequence $(P_n)_{n \in N}$ it follows that
$\pi_k(P_k) = \pi_k(\pi_k(P_{k+1})) = \pi_k(P_{k+1}) = P_k$ for all $k \in N$. So, for $(Q_n)_{n \in N}$ a projective
sequence, $P_k = \pi_k(P_k) = \pi_k(Q_k) = Q_k$ for all k implies $(P_n)_{n \in N} = (Q_n)_{n \in N}$.
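
The decision procedure can be sketched as follows, assuming the usual approximation operator with $\pi_0(P) = D$, $\pi_{k+1}(S) = S$, $\pi_{k+1}(D) = D$ and $\pi_{k+1}(P \unlhd a \unrhd Q) = \pi_k(P) \unlhd a \unrhd \pi_k(Q)$. A linear specification is represented as a dictionary from variable names to `"S"`, `"D"` or a triple `(action, true_variable, false_variable)`; the names `proj` and `equal` are hypothetical, and the naive unfolding is exponential in n, so it is meant for small examples only.

```python
def proj(spec, x, k):
    """The finite thread pi_k(x) for variable x of the linear specification spec."""
    if k == 0:
        return "D"
    t = spec[x]
    if t in ("S", "D"):
        return t
    a, p, q = t
    return (a, proj(spec, p, k - 1), proj(spec, q, k - 1))


def equal(spec, x, y):
    """Decide x = y by comparing the approximations pi_{n-1}, n the number of equations."""
    n = len(spec)
    return proj(spec, x, n - 1) == proj(spec, y, n - 1)


# P1 = a o P1;  P2 = a o P3, P3 = a o P2;  P4 = a o P5, P5 = S.
SPEC = {
    "P1": ("a", "P1", "P1"),
    "P2": ("a", "P3", "P3"),
    "P3": ("a", "P2", "P2"),
    "P4": ("a", "P5", "P5"),
    "P5": "S",
}
```

Here `equal(SPEC, "P1", "P2")` holds, since both variables denote the thread that performs a forever, while `equal(SPEC, "P1", "P4")` fails because $P_4 = a \circ S$ terminates after one action.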



C Thread-Service Composition 

Most of this text is taken from [14]. A service, or a state machine, is a pair $(\Sigma, F)$
consisting of a set $\Sigma$ of so-called co-actions and a reply function F. The reply
function is a mapping that gives for each non-empty finite sequence of co-actions
from $\Sigma$ a reply true or false.

Example 2. A stack can be defined as a service with co-actions push:i, topeq:i, 
and pop, for i = 1, . . . ,n for some n, where push:i pushes i onto the stack and 
yields true, the action topeq:i tests whether i is on top of the stack, and pop pops 
the stack with reply true if it is non-empty, and it yields false otherwise. 

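Services model (part of) the execution environment of threads. In order to
define the interaction between a thread and a service, we let actions be of the form
c.m where c is the so-called channel or focus, and m is the co-action or method.
For example, we write s.pop to denote the action which pops a stack via channel
s. For service $\mathcal{H} = (\Sigma, F)$ and thread P, $P /_c \mathcal{H}$ represents P using the service $\mathcal{H}$
via channel c. The defining rules for threads in BTA are:

$$\begin{array}{rcll}
S /_c \mathcal{H} &=& S, & \\
D /_c \mathcal{H} &=& D, & \\
(P \unlhd c'.m \unrhd Q) /_c \mathcal{H} &=& (P /_c \mathcal{H}) \unlhd c'.m \unrhd (Q /_c \mathcal{H}) & \text{if } c' \neq c, \\
(P \unlhd c.m \unrhd Q) /_c \mathcal{H} &=& P /_c \mathcal{H}' & \text{if } m \in \Sigma \text{ and } F(m) = \mathit{true}, \\
(P \unlhd c.m \unrhd Q) /_c \mathcal{H} &=& Q /_c \mathcal{H}' & \text{if } m \in \Sigma \text{ and } F(m) = \mathit{false}, \\
(P \unlhd c.m \unrhd Q) /_c \mathcal{H} &=& D & \text{if } m \notin \Sigma,
\end{array}$$

where $\mathcal{H}' = (\Sigma, F')$ with $F'(\sigma) = F(m\sigma)$ for all co-action sequences $\sigma \in \Sigma^+$.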

The operator $/_c$ is called the use operator and stems from [8]. An expression
$P /_c \mathcal{H}$ is sometimes referred to as a thread-service composition. The use operator
is expanded to infinite threads in BTA$^{\infty}$ by defining

$$(P_n)_{n \in N} /_c \mathcal{H} = \bigsqcup_{n \in N} P_n /_c \mathcal{H}.$$

(Cf. [3].) It follows that the rules for finite threads are valid for infinite threads
as well. Observe that the requests to the service do not occur as actions in the
behavior of a thread-service composition. So the composition not only reduces
the above-mentioned non-determinism of the thread, but also hides the associated
actions.

In the next example we show that the use of services may turn regular threads
into non-regular ones.

Example 3. We define a thread using a stack as defined in Example 2. We only
push the value 1 (so the stack behaves as a counter), and write S(n) for a stack
holding n times the value 1. By the defining equations for the use operator it follows
that for any thread P,

$$\begin{array}{rcl}
(s.push{:}1 \circ P) /_s S(n) &=& P /_s S(n+1), \\
(P \unlhd s.pop \unrhd S) /_s S(0) &=& S, \\
(P \unlhd s.pop \unrhd S) /_s S(n+1) &=& P /_s S(n).
\end{array}$$

Now consider the regular thread Q defined by

$$Q = (s.push{:}1 \circ Q) \unlhd a \unrhd R, \qquad R = (b \circ R) \unlhd s.pop \unrhd S,$$

where actions a and b do not use focus s. Then, for all $n \in N$,

$$\begin{array}{rcl}
Q /_s S(n) &=& ((s.push{:}1 \circ Q) \unlhd a \unrhd R) /_s S(n) \\
 &=& (Q /_s S(n+1)) \unlhd a \unrhd (R /_s S(n)).
\end{array}$$

It is not hard to see that $Q /_s S(0)$ is an infinite thread with the property that for
all n, a trace of n + 1 a-actions produced by n positive and one negative reply on a
is followed by $b^n \circ S$. This yields a non-regular thread: if $Q /_s S(0)$ were regular,
it would be a fixed point of some finite linear recursive specification, say with k
equations. But specifying a trace $b^k \circ S$ already requires k + 1 linear equations
$x_1 = b \circ x_2, \ldots, x_k = b \circ x_{k+1}, x_{k+1} = S$, which contradicts the assumption. So
$Q /_s S(0)$ is not regular.
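
The behavior described in Example 3 can be simulated directly. The following sketch does not implement the general use operator; it follows the three stack equations above with the stack reduced to a counter, and `trace` is a hypothetical helper that returns the visible actions for a given sequence of replies to a.

```python
def trace(replies):
    """Visible actions of Q /_s S(0), given the replies to the a-actions."""
    replies = iter(replies)
    state, counter, actions = "Q", 0, []
    while True:
        if state == "Q":
            actions.append("a")
            if next(replies):
                counter += 1      # s.push:1 is handled by the stack, then Q again
            else:
                state = "R"
        else:
            if counter == 0:      # s.pop on the empty stack replies false: S
                return actions
            counter -= 1          # s.pop replies true: perform b, then R again
            actions.append("b")
```

With three positive replies followed by a negative one, `trace([True, True, True, False])` yields four a-actions followed by three b-actions, as stated in the example for n = 3.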

Finally, we note that the use of finite state services, such as Boolean registers,
cannot turn regular threads into non-regular ones (see [8]). More information on
thread-service composition can be found in e.g. [14].